Phonetic Database of the Russian Speech Variability

Authors

  • Vladimir I. Kuznetsov
  • Tatiana Y. Sherstinova
Abstract

A database of contemporary Russian speech is being created for the Computer Fund of the Russian Language. The database consists of a corpus of sound files and their descriptions. Its sound material consists of recordings of the Phonetically Representative Texts read by (1) standard Russian speakers, (2) regional Russian speakers, (3) speakers from the former USSR republics for whom Russian is a second language, and (4) foreigners from European, Asian and American countries. A detailed description is being compiled for each syllable in the database. It comprises, in particular, the "real" transcription, the "ideal" phonemic and phonetic transcriptions, attributes of the sound, durations of the signal segments, acoustic features, and phonetic and paralinguistic comments. The database may be used in phonetic and related studies, for building speech algorithms, and for evaluating speech resources, speech technologies and products.
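
As an illustration only, the per-syllable description listed in the abstract could be modelled roughly as follows. This is a minimal sketch, not the authors' schema: every class name, field name, type and unit here (`SpeakerGroup`, `SyllableRecord`, `duration_ms`, and so on) is an assumption introduced for the example, since the abstract names the kinds of information stored but not how they are represented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class SpeakerGroup(Enum):
    """The four speaker categories named in the abstract (labels are hypothetical)."""
    STANDARD = "standard Russian speaker"
    REGIONAL = "regional Russian speaker"
    SECOND_LANGUAGE = "former USSR republic, Russian as a second language"
    FOREIGN = "foreigner (Europe, Asia, America)"


@dataclass
class Segment:
    """One signal segment inside a syllable; milliseconds are an assumed unit."""
    label: str
    duration_ms: float


@dataclass
class SyllableRecord:
    """Hypothetical record holding the description fields listed in the abstract."""
    speaker_group: SpeakerGroup
    real_transcription: str        # what was actually pronounced
    ideal_phonemic: str            # expected phonemic transcription
    ideal_phonetic: str            # expected phonetic transcription
    sound_attributes: Dict[str, str] = field(default_factory=dict)
    segments: List[Segment] = field(default_factory=list)
    acoustic_features: Dict[str, float] = field(default_factory=dict)
    comments: List[str] = field(default_factory=list)  # phonetic and paralinguistic notes

    def total_duration_ms(self) -> float:
        """Sum of the durations of all segments in the syllable."""
        return sum(seg.duration_ms for seg in self.segments)

    def deviates_from_ideal(self) -> bool:
        """True when the real transcription differs from the ideal phonetic one."""
        return self.real_transcription != self.ideal_phonetic
```

A record built this way would let a study filter syllables by speaker group or select only those whose real pronunciation departs from the ideal transcription, which is the kind of variability the database is meant to capture.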

Similar resources

Phonetics of Emotion in Russian Speech

This paper provides a description of the structure and goals of the database of Russian Language Affective (emotional) utterances. It also reports some preliminary results of an experimental phonetic analysis of the acoustic characteristics of emotional utterances (surprise, happiness, anger, sadness and fear) vs. neutral ones in Russian. The study utilizes 600 database utterances by 10 speaker...

An Algorithm of Generation of Alternative Phonetic Transcriptions for Spontaneous Russian Speech Recognitions

To increase the accuracy of automatic speech recognition, it is necessary to create alternative transcriptions of words that take into account the variability of spontaneous speech. This work proposes an algorithm that creates alternative transcriptions by applying the rules of reduction and assimilation found in spontaneous speech. Testing of the developed module was made with the text corpu...

Structure and annotation of Polish LVCSR speech database

This paper reports on the problems occurring in the process of building LVCSR (Large Vocabulary Continuous Speech Recognition) corpora based on the internal evaluation of the Polish database JURISDIC. The initial assumptions are discussed together with technical matters concerning the database realization and annotation results. Providing rich database statistics was considered crucial especial...

Language Features of Russian Texts of Engineering Discourse

The article is devoted to the applied problem of identifying the linguistic features of engineering texts. The study of Russian-language texts of engineering discourse is usually of an applied nature; in our case, this applied research is motivated by the need to teach foreigners who receive professional engineering education in Russia and in the Russian language. The object of the research is the Rus...

Development of multi-voice and multi-language TTS synthesizer (languages: Belarussian, Polish, Russian)

The paper describes some results of research aimed at filling the gap in introducing and promoting computerized speech technology for Slavonic languages, in particular TTS synthesis technology for Belarusian, Polish and Russian. A typological analysis of the peculiarities of the phonemic and allophonic systems of the Belarusian, Polish and Russian languages is given. Based on the resu...

Publication date: 1999